docs: aviary, verifiers, reasoning gym env integration docs #617
Conversation
Signed-off-by: Christian Munley <[email protected]>
> ## Rollout Collection
>
> ### Start vLLM Server
Can we follow the quickstart pattern of using a hosted model to generate rollouts?
```bash
echo "policy_base_url: https://api.openai.com/v1
policy_api_key: your-openai-api-key
policy_model_name: gpt-4.1-2025-04-14" > env.yaml
```
Unnecessary burden to get a model and serve it with vLLM, right?
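If the hosted-model pattern is adopted, the same `env.yaml` could also be written with a heredoc, which avoids quoting pitfalls in a multi-line `echo`. A sketch, reusing the file name and keys from the snippet above (the API key is a placeholder):

```shell
# Write the hosted-model endpoint config suggested in the review.
# Heredoc equivalent of the multi-line echo; values copied from the
# review snippet, with the API key left as a placeholder.
cat > env.yaml <<'EOF'
policy_base_url: https://api.openai.com/v1
policy_api_key: your-openai-api-key
policy_model_name: gpt-4.1-2025-04-14
EOF
# Sanity check: show what was written
cat env.yaml
```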
> ## Example Usage
>
> ### GSM8K Environment
Unlike my comment on Reasoning Gym (https://github.com/NVIDIA-NeMo/Gym/pull/617/changes#r2800137030), here we do not have the "setup steps" before running `ng_run`. We need one pattern and should follow it consistently.
> ---
>
> ## Start Model Server
Similar comment to the other env tutorials about self-hosting a model raising the barrier to entry and about consistency with the quickstart.
Also, this one doesn't include the instruction to pull weights from HF.
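For parity with the env tutorials that do pull weights, the missing step might look roughly like the following docs fragment (the model id is a placeholder, not taken from the PR; `huggingface-cli download` and `vllm serve` are the standard commands):

```bash
# Sketch of the missing weight-pull step; <org>/<model> is a placeholder
huggingface-cli download <org>/<model> --local-dir ./model
vllm serve ./model
```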
> ## Start Model Server
>
> ```bash
> uv add vllm
> ```
Are we going with pip or uv? The Reasoning Gym env has `pip install`: https://github.com/NVIDIA-NeMo/Gym/pull/617/changes#diff-ada604f88b18e8dbff44f513c28f5aad984dc5e3bbbd213d4c1aadd9214350f9R64
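Whichever tool the docs standardize on, the two commands are interchangeable for installing the `vllm` package. One way to sketch a tolerant snippet is to prefer `uv` when it is on `PATH` and fall back to `pip` (this detection helper is hypothetical, not part of the docs under review):

```shell
# Pick whichever installer the reader's machine has; either command
# installs the same vllm package. The command is echoed here rather
# than executed, since installation needs network access.
if command -v uv >/dev/null 2>&1; then
  INSTALL_CMD="uv add vllm"
else
  INSTALL_CMD="pip install vllm"
fi
echo "$INSTALL_CMD"
```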